15 research outputs found
Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering
Many vision and language tasks require commonsense reasoning beyond
data-driven image and natural language processing. Here we adopt Visual
Question Answering (VQA) as an example task, where a system is expected to
answer a question in natural language about an image. Current state-of-the-art
systems attempted to solve the task using deep neural architectures and
achieved promising performance. However, the resulting systems are generally
opaque and they struggle in understanding questions for which extra knowledge
is required. In this paper, we present an explicit reasoning layer on top of a
set of penultimate neural network based systems. The reasoning layer enables
reasoning and answering questions where additional knowledge is required, and
at the same time provides an interpretable interface to the end users.
Specifically, the reasoning layer adopts a Probabilistic Soft Logic (PSL) based
engine to reason over a basket of inputs: visual relations, the semantic parse
of the question, and background ontological knowledge from word2vec and
ConceptNet. Experimental analysis of the answers and the key evidential
predicates generated on the VQA dataset validate our approach.Comment: 9 pages, 3 figures, AAAI 201
LoNLI: An Extensible Framework for Testing Diverse Logical Reasoning Capabilities for NLI
Natural Language Inference (NLI) is considered a representative task to test
natural language understanding (NLU). In this work, we propose an extensible
framework to collectively yet categorically test diverse Logical reasoning
capabilities required for NLI (and by extension, NLU). Motivated by behavioral
testing, we create a semi-synthetic large test-bench (363 templates, 363k
examples) and an associated framework that offers following utilities: 1)
individually test and analyze reasoning capabilities along 17 reasoning
dimensions (including pragmatic reasoning), 2) design experiments to study
cross-capability information content (leave one out or bring one in); and 3)
the synthetic nature enable us to control for artifacts and biases. The
inherited power of automated test case instantiation from free-form natural
language templates (using CheckList), and a well-defined taxonomy of
capabilities enable us to extend to (cognitively) harder test cases while
varying the complexity of natural language. Through our analysis of
state-of-the-art NLI systems, we observe that our benchmark is indeed hard (and
non-trivial even with training on additional resources). Some capabilities
stand out as harder. Further fine-grained analysis and fine-tuning experiments
reveal more insights about these capabilities and the models -- supporting and
extending previous observations. Towards the end we also perform an user-study,
to investigate whether behavioral information can be utilised to generalize
much better for some models compared to others.Comment: arXiv admin note: substantial text overlap with arXiv:2107.0722
Tricking LLMs into Disobedience: Understanding, Analyzing, and Preventing Jailbreaks
Recent explorations with commercial Large Language Models (LLMs) have shown
that non-expert users can jailbreak LLMs by simply manipulating the prompts;
resulting in degenerate output behavior, privacy and security breaches,
offensive outputs, and violations of content regulator policies. Limited formal
studies have been carried out to formalize and analyze these attacks and their
mitigations. We bridge this gap by proposing a formalism and a taxonomy of
known (and possible) jailbreaks. We perform a survey of existing jailbreak
methods and their effectiveness on open-source and commercial LLMs (such as GPT
3.5, OPT, BLOOM, and FLAN-T5-xxl). We further propose a limited set of prompt
guards and discuss their effectiveness against known attack types
Multilingual CheckList: Generation and Evaluation
The recently proposed CheckList (Riberio et al,. 2020) approach to evaluation
of NLP systems has revealed high failure rates for basic capabilities for
multiple state-of-the-art and commercial models. However, the CheckList
creation process is manual which creates a bottleneck towards creation of
multilingual CheckLists catering 100s of languages. In this work, we explore
multiple approaches to generate and evaluate the quality of Multilingual
CheckList. We device an algorithm -- Automated Multilingual Checklist
Generation (AMCG) for automatically transferring a CheckList from a source to a
target language that relies on a reasonable machine translation system. We then
compare the CheckList generated by AMCG with CheckLists generated with
different levels of human intervention. Through in-depth crosslingual
experiments between English and Hindi, and broad multilingual experiments
spanning 11 languages, we show that the automatic approach can provide accurate
estimates of failure rates of a model across capabilities, as would a
human-verified CheckList, and better than CheckLists generated by humans from
scratch
Melting of the vortex lattice through intermediate hexatic fluid in a-MoGe thin film
The hexatic fluid refers to a phase in between a solid and a liquid which has
short range positional order but quasi-long range orientational order. In the
celebrated theory of Berezinskii, Kosterlitz and Thouless and subsequently
refined by Halperin, Nelson and Young, it was predicted that a 2-dimensional
hexagonal solid can melt in two steps: first, through a transformation from a
solid to a hexatic fluid which retains quasi long range orientational order and
then from a hexatic fluid to an isotropic liquid. In this paper, using a
combination of real space imaging and transport measurements we show that the
2-dimensional vortex lattice in a-MoGe thin film follows this sequence of
melting as the magnetic field is increased. Identifying the signatures of
various transitions on the bulk transport properties of the superconductor, we
construct a vortex phase diagram for a two dimensional superconductor.Comment: New Data added in this versio
Explainable Image Understanding Using Vision and Reasoning
Image Understanding is fundamental to intelligent agents.Researchers have explored Caption Generation and VisualQuestion Answering as independent aspects of Image Understanding (Johnson et al. 2015; Xiong, Merity, and Socher2016). Common to most of the successful approaches, are the learning of end-to-end signal mapping (image-to-caption, image and question to answer). The accuracy is impressive. It is also important to explain a decision to end-user(justify the results, and rectify based on feedback). Very recently, there has been some focus (Hendricks et al. 2016;Liu et al. ) on explaining some aspects of the learning systems. In my research, I look towards building explainableImage Understanding systems that can be used to generate captions and answer questions. Humans learn both from examples (learning) and by reading (knowledge). Inspired by such an intuition, researchers have constructed Knowledge-Bases that encode (probabilistic) commonsense and background knowledge. In this work, we look towards efficiently using this probabilistic knowledge on top of machine learning capabilities, to rectify noise in visual detections and generate captions or answers to posed questions